ABSTRACT

The first three reports in the Six Subject Survey
conducted by the International Association for the Evaluation of
Educational Achievement reported cognitive and affective outcomes of
school education in science, literature, and reading comprehension
for students at the 10-, 14-, and 18-year-old levels in over 20
countries, four of which were less developed. The discussion of
international evaluation takes place against this background.
Discussion is given to the purposes of international surveys of
educational systems, misgivings about the appropriateness of
employing international evaluation standards, the organization of the
international evaluation effort, the mean student performance in
science and reading in industrialized and nonindustrialized
countries, and the establishment of research competence in education
in less-developed countries. (Author/IBT)

1. Purposes of international surveys of educational systems

In 1973 the first three reports from the so-called Six Subject Survey
conducted by the International Association for the Evaluation of Educational
Achievement (IEA) were published (Comber and Keeves, 1973; Purves, 1973;
Thorndike, 1973). They reported cognitive and affective outcomes of school
education in science, literature, and reading comprehension for students at
the 10-, 14-, and 18-year-old levels in some twenty countries. Most of these
countries were highly industrialized, but four less developed countries (LDC's)
also participated. Three more subject areas will be reported in the near future,
namely English and French as foreign languages and civic education.

The evaluation was carried out by giving representative samples of
students at the three levels mentioned achievement and attitude tests devised
by international committees who had spent three years designing and trying
out these instruments. The IEA mathematics project conducted in 1962-65
(Husén, 1967a) was the first attempt on a large scale to obtain objective
measurements of student performance for a broad array of countries, all
of them, however, industrialized.

The IEA survey has been a huge enterprise in terms of time, money and
number of individuals involved. The Six Subject Survey comprised some
230,000 students in 9,700 schools. One could, indeed, ask what kind of
rationale could be advanced for such a project with all its technical
complexities and far-reaching administrative complications.

When the IEA research was launched some 15 years ago, the National
Centres involved simply wanted to take advantage of international variability
with regard to both the outcomes of the educational systems and the factors
which accounted for differences in these outcomes. In a way, the world was
conceived of as one big educational laboratory where different practices
in terms of school organization, curriculum content and methods of
instruction were experimented with. But before trying to analyse
cross-nationally the 'effects' of various input factors on educational
outcomes, it was necessary to devise internationally valid evaluation
instruments. Not until the IEA research was launched were such instruments
available.

Therefore the prime concern during the first years of IEA research was
the construction of appropriate measuring techniques that could result in
the establishment of adequate international yardsticks. These were, indeed,
badly needed, not least for evaluating school reforms in all the countries
and particularly technical assistance programmes in education in the LDC's.
Pure 'head-counting', for instance enrolment and graduation statistics (see
e.g. Harbison and Myers, 1964), was often used as a criterion of evaluation,
lacking qualitative indicators, such as student competence achieved in
various subject areas. The efforts at the beginning of the IEA research to
devise instruments by means of which international standards could be
established unfortunately gave some people the false impression that the
main purpose of the exercise was to conduct some kind of international
horse race or 'cognitive Olympics'. But the development of new evaluative
techniques and the setting up of the international co-operative machinery
that went with it were prerequisites for establishing international
standards in a series of subject areas, such as mathematics and reading.
Not until the IEA reading survey, which also comprised three LDC's
(Chile, India and Iran), were any comparative assessments of the level of
literacy among representative groups of students in such countries available.

Once suitable measuring instruments were available, the next step was
to identify the salient factors which accounted for cross-national differences.
Since this could be done in a replicative way at the various levels of the
single national systems, and across these systems, a much more multi-faceted
picture of factors accounting for differences in student attainment between
school systems could be obtained. The comparative approach implied that we
widen the population of classrooms from one particular school within one
particular national system to a representative set of classrooms within
several national systems. IEA shared the ambitions prevalent in the social
sciences in general, that is to say, to arrive at generalizable findings.
By repeating surveys and analyses over many countries which differed with
regard to important social and economic factors, a more detailed picture
of what accounted for differences in 'productivity' between these systems
could be arrived at. Since the ultimate aim of research in the social
sciences is not only to identify and describe but to explain and predict,
that is to say, to generalise, the basis for such an operation can be
broadened by including inter-system and inter-country variables which allow
cross-national generalisations and also make it possible to study how
intra-system and inter-system variables interact.

We can take as an illustration how class size is related to student
performance. Practically all the sample surveys so far have been carried
out in the United States and some West European countries. These studies
consistently indicate that class size and performance tend to be positively
correlated at the level of 0.10 to 0.20. The fact, however, that class size
within these countries covers a rather narrow range makes generalizations
about such a relationship extremely awkward. In a multi-national study one
can take into account variables such as teacher competence, school resources,
and socio-economic structure, which vary widely between countries. This
provides an opportunity not only for obtaining a more diversified descriptive
picture but also for opening up new avenues of analysis.
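
The difficulty with a narrow range can be made concrete with a small
sketch. It uses the standard correction for direct range restriction
(often called Thorndike's Case 2), which is not part of the IEA analyses
and is brought in here only for illustration. It assumes a linear
relationship with equal scatter across the whole range, and the ratios of
standard deviations fed to it are hypothetical.

```python
import math


def correct_for_range_restriction(r_restricted: float, sd_ratio: float) -> float:
    """Estimate the correlation in a population with a wider range.

    r_restricted: correlation observed where class size varies little.
    sd_ratio: SD of class size in the wider population divided by the SD
              in the restricted sample (a hypothetical input, >= 1).
    Assumes linearity and equal scatter over the whole range.
    """
    r, u = r_restricted, sd_ratio
    return (r * u) / math.sqrt(1.0 - r * r + r * r * u * u)


if __name__ == "__main__":
    # 0.15 lies inside the 0.10 to 0.20 band quoted in the text;
    # the ratios 1, 2 and 3 are illustrative assumptions only.
    for ratio in (1.0, 2.0, 3.0):
        estimate = correct_for_range_restriction(0.15, ratio)
        print(f"SD ratio {ratio:.0f}: estimated correlation {estimate:.3f}")
```

A ratio of 1 returns the observed value unchanged, and larger ratios give
larger estimates. The same observed coefficient is therefore compatible
with rather different relationships once class size varies as widely as
it does between countries.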

One overriding purpose of the IEA Six Subject Survey has been to study
the relationship between input factors in the social, economic and
instructional domains and output as measured by international tests covering
both cognitive (student performance) and affective behaviours (student
attitudes and motivation). These relationships have been studied in some
twenty national systems of education and, as a rule, at three different
levels within each system.

After the completion of the IEA mathematics survey, two international
meetings resulted in the report, "Toward a cross-national model of educational
achievement in a national economy" (Super, 1970). The aim was to develop an
input-output model that could serve as a theoretical framework for the next
survey, where achievement criteria from six subject areas were going to be
developed. Researchers from the various social science disciplines were
brought together to review both national and international research already
undertaken and to advance new hypotheses which could be tested in further
research. They were also asked to suggest the inclusion of independent
variables of a social and economic nature that should be included in the
proposed survey.

A key problem in conducting cross-national evaluation studies, where
comparisons are made between student performance by means of standardized
achievement tests, has to do with comparability per se (Husén, 1967b). Two
major comparability problems are encountered: the drawing of strictly
comparable samples of students and the construction of measuring instruments
that are 'fair' in terms of their content, matching the students' opportunity
to learn the subject-matter tapped by the tests. The technical aspects of
these problems have been dealt with in detail in the IEA international reports
(see, e.g. Peaker, in press; Comber and Keeves, 1973, p. 42 et seq.). IEA has
succeeded in establishing a system whereby national random samples, be
they age samples or grade samples, can be drawn. Once the target populations
have been defined (e.g. 14-year-olds) and the sampling design has been drawn
up, the problem of executing the sample is mainly an administrative one.
In several countries, both developed and less developed, the conduct of the
Six Subject Survey was the first occasion when nationally representative
samples of students were drawn. The experiences gained in countries like
Iran and India, for instance, can be drawn upon in the future when procedures
of evaluating entire national systems by means of random samples are going
to be established as routines.

2. Misgivings about appropriateness of employing international evaluation
standards

One criticism levelled against the IEA mathematics study by mathematics
educators in a special issue of the Journal for Research in Mathematics
Education (Findley, 1971) was a lack of comparability due to considerable
differences between countries in terms of the amount of exposure to the
teaching of the various topics covered by the items in the international
mathematics tests. Country means of teachers' ratings of 'opportunity to learn'
and student achievement are indeed rather highly correlated over countries
(see, e.g. Comber and Keeves, op. cit., p. 158 et seq.). But it should be
kept in mind that rank order correlations between country aggregates could
be quite high, and indeed are. When countries were correlated over item
difficulties, it was found that the overlap in achievement structure was
remarkable, that is to say, country differences were only to a minor
extent accounted for by dramatic differences in particular topics or sub-areas
within one subject but rather by systematic differences over the whole range
of topics and items. At least in subjects like mathematics and science,
where the subject matter by its very nature is rather universal, the differences
between national systems seem to affect all topical areas in a systematic way
and not just a few.
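
What it means to correlate countries over item difficulties can be shown
with a minimal sketch. The item values below are invented for
illustration and are not IEA data, and the country names are
placeholders. A country that is uniformly weaker on every item keeps the
same difficulty profile, so the correlation stays at its maximum. A
country that is weak on one topic only has a different profile, and the
correlation drops.

```python
import math


def pearson(xs, ys):
    """Plain product-moment correlation between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical proportions of students answering each of six items correctly.
country_a = [0.85, 0.70, 0.62, 0.48, 0.35, 0.20]
# Systematically lower on every item: same profile, lower level.
country_b = [p - 0.15 for p in country_a]
# Same level as country A except for one topic (the fourth item).
country_c = [0.85, 0.70, 0.62, 0.10, 0.35, 0.20]

if __name__ == "__main__":
    print("A vs B:", round(pearson(country_a, country_b), 3))
    print("A vs C:", round(pearson(country_a, country_c), 3))
```

A high correlation of this kind between two countries thus says nothing
about the size of the gap between them, only that the gap is spread over
the whole range of items.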

The machinery that went with the construction of the international
achievement tests in a way served as a safeguard against undue cultural bias.
An international committee was set up for each subject area. These committees,
being composed of subject matter specialists, teachers, test developers and
curriculum specialists, were responsible for the construction of the test
instruments and for the development of questionnaires related to their
respective fields (see, e.g. Comber and Keeves, 1973, p. 27 et seq.).
Contact with the participating countries was effected through the national
research centres and subject committees set up in each country. The analyses
of the curricula, the proposing of test exercises and the try-out of the
items were carried out in the participating countries. IEA headquarters
served only as a coordinating centre and a clearing house.

Since the main purpose of achievement tests is evidently to measure
differences in achievement, complete equality in terms of exposure to
teaching and opportunity to learn would make the administration of such
tests rather pointless. The same applies to so-called intelligence tests,
where individual and group differences unavoidably also reflect differences
in terms of opportunity. As has been spelled out in another connexion
(Husén, 1967b), the international administration of achievement tests
differs only in degree and not in principle from their administration on
a national scale. Within a given country there are differences between
school districts and regions due both to differences in student background
and school resources. Very few are those who would dispute the
worthwhileness of administering the same test of achievement to all the
children at the same grade level in a given country, once the test measures
the main objective it is purported to measure. For instance, the finding
within a given country that children in urban areas perform better than
children from rural areas or that the socially privileged have higher scores
than the underprivileged is per se not to be interpreted as an act of
discrimination against those who socially and pedagogically have been
subjected to the less favourable conditions. The establishment of factual
differences in terms of agreed criteria of performance is in itself of
informative value. It can, as in the case of the IEA research, serve as a
basis for analysis of what factors account for differences in performance
and can ultimately be used for more adequate educational policy. The data
collected can also serve as a basis for evaluating how far students have
been brought under the prevailing conditions and for analyses of what could
be done in order to improve these conditions.

The rationale indicated above also applies to comparisons between
highly industrialized and more or less agricultural economies, in brief,
to comparisons between developed countries and LDC's. So far, no
representative comparative information with regard to student competence in
LDC's has been available. Those who have first-hand experience have
intuitively felt that differences between students who grow up in countries
where there is a long tradition of literacy, and those whose parents in most
cases are illiterate, are sometimes quite significant.

Misgivings have been expressed in some quarters about the
worthwhileness of an exercise where national school systems in LDC's have been
evaluated according to the same standards as those in the industrialized
countries with their tradition of universal formal schooling that now is
some hundred years old or more. These misgivings range all the way from
objections about 'comparing the incomparable' to pointing out that the
LDC's can be expected to suffer from certain handicaps because of the
format and methodology employed in conducting the evaluation.

It would in this connexion take us too far to discuss in detail
the adequacy - or lack of adequacy - of the methodology. I shall therefore
limit myself to spelling out the rationale for establishing a common
standard of achievement in an attempt to evaluate national systems of
education in both industrialised and non-industrialised countries, the
latter allegedly attempting to develop their economies in the same
direction as the former. I shall also point out certain flaws evidenced by
the Six Subject Survey, which - it should be kept in mind - was the first
systematic attempt to evaluate primary and secondary education in LDC's
according to some kind of international norms.

The introduction of universal elementary schooling in, for instance,
Western Europe during the 19th century, when in most countries certain
basic schooling was made compulsory by state legislation (frequently with
opposition from the peasants), has to be viewed in its economic context.
Most of the countries were in the midst, or at the beginning, of a 'great
leap' in industrialisation. Apart from the task of taking care of children
in urban areas whose parents were working long hours in the factories, the
school was supposed to provide the literacy and numeracy required by the
labour force in industry. To be sure, most LDC's are not yet at the
stage of industrialization reached by the West European countries by, say,
1870. Subsistence from agriculture is still far more widespread, and
this of course raises some doubts about the adequacy of institutionalized
elementary schooling when the children in rural areas by tradition work
with their parents. But if the goal behind the efforts to build up an
educational system in the LDC's is to achieve 'more modernization', that is,
among other things, to build up an infrastructure of knowledge and skills
conducive to an economic development which has radically changed the standard
of living in the industrialized countries, then much can be said for attempts
to assess the competence achieved in, for instance, reading and science that
is basic to modern technology. Such competencies have been defined in the
IEA survey as the result of cooperation between the participating research
institutions in both developed and developing countries. Instruments for
their measurement were constructed and tried out conjointly before being
administered to representative samples of students in the respective countries.

The format of the achievement tests employed constituted a serious
handicap for students in the LDC's. Psychological studies have shown that
children brought up in cultures where sustained efforts in pursuing assigned
tasks have not been an everyday part of their training have difficulties in
mobilizing the motivation that is required to complete a test examination with
increasingly difficult test exercises. The tests were so-called
paper-and-pencil ones, that is, the students had to read the exercises and then
respond by blackening the space on an answer sheet that corresponded to the
correct alternative, of which there were five as a rule. The most serious
drawback among students at the 10- and 14-year-old levels was their frequent
lack of the reading competence necessary to understand the test exercises.
A high proportion of those who either gave wrong responses or omitted
responses did so because they were unable to understand the questions. Thus,
one important lesson learned from the Six Subject Survey is that in
evaluating cognitive competence, be it skills in the three R's or basic items
of information in the content subjects, such as science or civics, one would
have to develop new formats for the examinations which would reduce the
handicap inherent in a low level of reading competence. On the other hand,
since a certain level of reading comprehension is instrumental in
acquiring knowledge in other subject areas, it could be argued that
lack of sufficient skill in reading should not be regarded as a serious
handicap.

I am fully aware of the objections raised by some of my
colleagues in the international assistance agencies that the
comparisons have been 'invidious', because they might not have taken
fully into account the explanatory factors underlying the very
significant differences in achievement between developed and less
developed countries. A more 'pluralistic' approach would have seemed
to be in order. Apart from the fact that the IEA survey, as was
emphasized above, was not intended to be an international Olympics,
the crucial point is to what extent it is justified to apply one
standard of comparison across countries so different in their social
and economic structure, not to speak of the tremendous differences
in culture and traditions. The point made above for the
'unidimensional' approach is that if one wants to achieve
'modernization', then certain consequences are entailed, such as the
establishment of certain competencies conducive to industrialization.

3. Organisation of international evaluation of educational outcomes

It was pointed out above that to conduct multi-national evaluation
surveys is, indeed, a complicated task. A basic prerequisite is the setting
up of some kind of machinery that can secure the necessary co-ordination
and communication between the participating research institutions. The
national research centres have to take decisions about subject areas and
problems they want to investigate. A uniform design guiding the
construction of instruments, data collection and data processing has to be
laid down. A timetable for all these activities has to be agreed upon. Since
several languages are involved - in the Six Subject Survey no less than
14 - problems of translation of tests and manuals of instruction have to be
properly handled. For instance, to what extent is it possible to avoid
cultural biases when tests of reading comprehension are constructed,
translated and given in vastly different cultural settings? This problem
is a challenging research task in its own right. It was dealt with in the
feasibility study and was further elucidated in the Six Subject Survey
when reading tests were given to students in three developing countries
(Thorndike, 1973). However, communication problems are not solved by
penetrating language barriers only. Differences in national values and
habits can cause difficulties, not least with regard to promptness -
or lack of promptness - in responding to letters or sticking to timetables.

Since IEA constitutes the largest network of co-operating research
institutes conducting empirical research in education in the world today,
it would seem in order to describe briefly its organisational features.

In 1959, a group of researchers from twelve countries, who convened
under UNESCO auspices, decided to embark upon a small pilot study to examine
to what extent it was feasible and meaningful to undertake multi-national
'standardised' survey research. The pilot study turned out to be rather
successful in both respects. It was possible in a series of subject areas
to construct achievement tests that could be translated and administered
uniformly to students in different countries and to arrive at meaningful
interpretations of between-country differences (Foshay, 1962). It was
administratively and technically feasible to collect data uniformly and to
have them processed in one place. Therefore, it was decided to undertake
a more rigorous study using probability samples from twelve countries,
all of which were industrialized (Australia, Israel, Japan, the United States
and eight West European countries). Student achievement in mathematics
was chosen as the main criterion of output, since this subject by its
universal nature seemed to be more readily accessible to international
comparisons than other subject areas, possibly with the exception of
science.

In the IEA mathematics study, two major levels in the school systems
of the twelve countries were sampled (Husén, 1967):

(a) 13-year-olds (both age and grade populations), since this was
    the last point in all the systems where 100 per cent of the
    relevant age group was still in full-time schooling; and

(b) pre-university grade students.

In all, 133,000 students were tested and completed questionnaires in
the mathematics study. Furthermore, 13,500 teachers and 5,450 school
principals completed questionnaires with information on instruction,
curriculum and school resources. The information gathered in this survey
was used to test hypotheses concerning: (1) the relationship between
different teaching practices in school and outcomes of instruction; (2) the
relationship between the organisational features of the systems, such as age
of school entry, grouping practices, and student-teacher ratio, and outcomes;
and (3) the relationship between home background and outcomes. Several
special studies, for instance one on the relationship between the 'yield'
and certain organisational features (Postlethwaite, 1967), were also
conducted.

After the completion of the feasibility study and the first main
study (in mathematics) the participating research centres in 1967 formed a
corporate body. The main reason for this was to establish IEA as a legal
entity eligible for research grants. Thus, IEA is now an international
non-profit-making, non-governmental association constituted under the name
of the "International Association for the Evaluation of Educational
Achievement".

The Association is constituted in accordance with the Belgian law
of 1919 regarding international non-profit-making scientific societies,
which was modified by a law of 1954. IEA has from its inception had
close relationships with the United Nations Educational, Scientific and
Cultural Organisation (UNESCO). The feasibility study and the mathematics
survey were conducted under the auspices of the UNESCO Institute for
Education in Hamburg, where the IEA working headquarters were located until
1969. At that date they were moved to Stockholm and are at present
accommodated within the Institute for the Study of International Problems
in Education in the University of Stockholm. IEA has a consultative
relationship with UNESCO.

Membership in IEA is restricted to institutions carrying out research
in education. In order to be eligible for membership an institute should
have a good reputation, qualified staff, ready access to schools in the
national school system and the necessary financial resources to carry out
the research work to which the institute has committed itself. Membership
is decided, upon application, by the IEA Council, which is made up of one
representative from each national centre. The number of members is at
present 23, consisting of ten West European countries (Belgium (with the
Flemish-speaking and French-speaking parts being treated as two separate
entities), Federal Republic of Germany, Finland, France, Ireland, Italy,
Netherlands, Scotland, Sweden and the United Kingdom), three East European
countries (Hungary, Poland and Romania), and nine non-European countries
(Australia, Chile, India, Iran, Israel, Japan, New Zealand, Thailand and
the United States).

The Council meets, in principle, once a year and determines the
general policy of the Association. It elects a Chairman and a Standing
Committee consisting of six of its members. The Standing Committee elects
two of its members to serve with the Chairman on the Bureau, which meets
several times a year and is responsible for the execution of decisions taken
by the Council. The centre staff employed by IEA consists of an Executive
Director, research officers, technical assistants and secretaries. During
the Six Subject Survey two data processing units were established, one in
New York for the first stages of processing and one in Stockholm for further
processing and the statistical analyses. A data bank has been established
at the University of Stockholm.

In conducting the Six Subject Survey, the Council had to establish
various bodies for conducting and reporting the research. As mentioned
above, one international committee was appointed by the Council in each
subject area in which survey research was undertaken. Further, the Council
set up a technical committee which was responsible for overall decisions
taken on technical problems pertaining to sampling, data collection and
data processing. The international committees interact with national
committees set up in the various subject areas. For example, during the
IEA Six Subject Survey some 300 persons spread across 19 countries with
14 different languages were engaged in the construction of instruments.
During the mathematics study English and French were used as linguae
operandi at international meetings and in correspondence, but in the Six
Subject Survey it was decided to use only English.

In the Six Subject Survey the data for 250,000 students were made
available to the data processing centre either on cards which could be
optically scanned (MRC cards, used in most cases), on tapes or on punched
cards. The MRC card-reading took place in Iowa City. The editing, sorting,
filing, item analysis and run-off of univariates were done in New York at
Columbia University, and the bivariate and multivariate analyses were
conducted at the University of Stockholm. Data on some 2,000 variables were
collected, most of these being input variables. The variables in any one
subject area at any one level of the school system amounted to between
[illegible] and 500. To be sure, there were too many to be manageable in
multivariate analyses and they had to be considerably whittled down on the
basis of analyses of the intercorrelation matrices.

4. Mean performance in science and reading in industrialised and
non-industrialised countries

The following three target populations were sampled in the Six
Subject Survey:

Population I:   All students in full-time schooling aged 10:00-10:11
Population II:  All students in full-time schooling aged 14:00-14:11
                and
Population IV:  All students in the terminal year in full-time
                secondary school programmes which were either
                pre-university programmes or programmes of the same
                length (this gave the national centres some
                latitude of interpretation, which means that in
                some countries only those students who were about
                to complete courses which in a narrow sense qualify
                for university entrance were included, whereas in
                other countries those who were about to complete
                qualified vocational programmes were also included).

It would indeed be preposterous to try to condense the findings from
the comprehensive Six Subject Survey into a few pages: the report series
will upon completion consist of nine volumes! We shall therefore confine
ourselves here to a presentation of some findings which seem to have a
particular bearing on the evaluation of education in LDC's, particularly
since this is the first time that qualitative comparisons between
industrialised countries and LDC's have been made according to agreed-upon
international yardsticks.

Table 1 shows the means and standard deviations in total science
score and total reading comprehension score in the 19 participating countries,
of which four are mainly less developed. We have limited ourselves to these
two cognitive criteria, since data on them are available for four and three
LDC's respectively. The only LDC which participated in literature was Chile,
which also participated in English and French. Iran was the only LDC
participating in civics.

The most dramatic difference is the one between the industrialised
and non-industrialised countries. The latter are consistently far behind
the former in average achievement over subject areas and levels of schooling.
In science the mean score of the LDC's was roughly one standard deviation or
more below that of the more developed countries. This means, then, that in
science the average student in an LDC scores between the 10th and 12th
percentile in a developed country.
The difference is even more pronounced in reading comprehension, where only
some 5 to 10 per cent of the students in the LDC's score at the level of the
average student in a more developed country. Chile participated, as mentioned
above, in the survey of French and English as foreign languages and Iran in
civics. The mean cognitive scores in both cases turned out to be on the
same relative level as in science and reading.
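
The percentile statement can be checked with a short calculation. The sketch below assumes, purely for illustration, that scores in the developed countries are normally distributed; the helper name `percentile_in_reference` is hypothetical and is not part of the IEA analyses.

```python
from statistics import NormalDist


def percentile_in_reference(gap_in_sd: float) -> float:
    """Percentile rank, within the reference (developed-country) distribution,
    of a score lying `gap_in_sd` standard deviations below the reference mean.

    Assumption: the reference scores are normally distributed.
    """
    return 100.0 * NormalDist().cdf(-gap_in_sd)


# Gaps of 1.0, 1.2 and 1.3 standard deviations are illustrative choices.
for gap in (1.0, 1.2, 1.3):
    pct = percentile_in_reference(gap)
    print(f"{gap:.1f} SD below the mean -> percentile {pct:.1f}")
```

Under the normality assumption, a gap of exactly one standard deviation corresponds to roughly the 16th percentile, and gaps of about 1.2 to 1.3 standard deviations correspond to the 10th to 12th percentile quoted above.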

What explanations can be advanced for such big differences? In
the first place, we must emphatically caution against any premature
conclusions about the 'productivity' or 'efficacy' of the school systems in
the two types of countries on the basis of the mean scores presented in
Table 1. The differences that we find between the industrialised countries
are negligible in comparison with the gap between the two categories of
countries. There is, however, no reason to believe that the rich countries
all are on the same level of 'efficacy' as regards their school systems.

A first-hand explanation that would seem plausible is that the tests
are not doing justice to the children in the LDC's. The tests might draw
upon knowledge and learning experiences that are more predominant in the
rich countries. Furthermore, the test situation as such and the
format of assessing the outcomes of learning might imply a certain cultural




Table 1. Mean total score and standard deviation in science and reading comprehension
among 10-year-olds, 14-year-olds, and pre-university students

                                            SCIENCE                            READING COMPREHENSION
                            10-year-olds  14-year-olds     Pre-univ.  10-year-olds  14-year-olds     Pre-univ.
                                M     SD      M     SD      M     SD      M     SD      M     SD      M     SD
Australia                       -      -   24.6   13.4   24.7   10.7      -      -      -      -      -      -
Belgium (Flemish)            17.9    7.3   21.2    9.2   17.4    8.1   17.5   10.2   24.6    9.7   25.0    9.3
Belgium (French)             13.9    7.1   15.4    8.8   15.3    7.9   17.9      ?   27.2    8.7   ?7.6    9.2
England                      15.7    8.5   21.1   14.?   23.1   11.5   ?8.5   11.?   25.3   11.?   33.6    9.0
F.R.G.                        ?.9    7.?   23.7   11.5   26.9    8.9      -      -      -      -      -      -
Finland                      17.5    8.2   20.5   10.6   19.8    9.8   19.4   10.?   27.1   10.9   30.0    7.5
France                          -      -      -      -   18.3    8.7      -      -      -      -      -      -
Hungary                      16.7    8.0   29.1   12.7   23.0    9.0   14.0    9.8   25.5    9.9   ?3.?      ?
Israel                          -      -      -      -      -      -   13.8      ?   22.6   12.?   2?.2   10.6
Italy                        16.5    ?.6   1?.5   10.2   15.9    8.?   19.9    8.8   27.9    9.3   23.9   10.2
Japan                        21.7    7.7   31.2   14.8      -      -      -      -      -      -      -      -
Netherlands                  15.3    7.6   17.8   10.0   23.3   11.1   17.7    9.5   25.2   10.2   ?1.2    7.0
New Zealand                     -      -   24.2   12.9   29.0   11.6      -      -   29.3   11.0   35.4    8.1
Scotland                        ?      ?      ?      ?      ?   12.1   18.4   11.1   27.0   11.5   34.4    8.2
Sweden                       18.3    7.3   21.7   11.7   19.2   10.2   21.5   10.5   25.6   10.?   26.8    9.?
United States                17.7    9.3   21.6   11.6   13.7    9.5   16.8   11.6   2?.3   11.6   21.8   12.0
Industrialized countries     16.7    7.9   22.3   11.8   20.9    9.9      -      -      -      -      -      -
Chile                         9.1    8.6    9.2    8.?    8.8    6.0    9.1    9.3   14.1   11.1   16.0    8.8
India (1)                     8.5    3.3    7.6    9.0    6.0    6.0    8.5    9.4    5.2    7.2      ?    5.8
Iran                          4.1    5.4    7.8    6.1   10.2    5.6    3.7    6.9    7.8    6.7    4.4    6.0
Thailand (2)                  9.9    6.5   15.6    8.1   12.4    6.1      -      -      -      -      -      -

(- = no data reported; ? = digit or entry illegible in the source copy.)

(1) India sampled the Hindi-speaking states or regions only.
(2) Thailand did not test a national sample, but sampled schools in the Bangkok area only.


bias against students in LDC's. We certainly cannot entirely refute such
hypotheses, but they do not get much support from the empirical evidence
we have. In the first place, the content of the tests, i.e. the individual
test items, went through a long procedure of scrutiny and try-out before
they were 'passed' by all the national subject area committees and included
in the international tests. Secondly, the rank order of difficulties of
items tended to be highly correlated over countries, which indicates that
differences in total scores between countries are not so much accounted for
by differences in particular sub-areas or topics of a particular subject
as by systematic differences in level of competence. The teachers were
asked to rate, on a four-point scale, each item in the tests with regard
to what opportunity the students in their classes had had to learn the
subject matter that was assessed by the item. As far as science is
concerned, the average opportunity tended to be somewhat lower for Populations
II and IV in the LDC's (see Comber and Keeves, 1973). However, these
differences in opportunity can by no means explain more than a small portion
of the difference in mean performance.
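
The rank-order argument can be illustrated with a small sketch. It computes a Spearman rank correlation between item difficulties in two countries; the proportions correct below are invented for illustration and are not IEA data, and the function names `ranks` and `spearman` are hypothetical. The source does not say which correlation coefficient the IEA analysts used, so Spearman's coefficient is an assumption here.

```python
from statistics import mean


def ranks(values):
    """Average ranks (1 = smallest); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented proportions answering each of six items correctly (NOT IEA data).
country_a = [0.82, 0.74, 0.61, 0.55, 0.40, 0.31]  # higher-scoring country
country_b = [0.45, 0.38, 0.30, 0.33, 0.15, 0.10]  # lower-scoring country

rho = spearman(country_a, country_b)
print(f"Spearman rank correlation of item difficulties: {rho:.3f}")
```

In this invented example country B scores lower on every item, yet the items fall in nearly the same order of difficulty (a coefficient of about 0.94). That is the pattern the text describes: a difference in overall level rather than in particular topics.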

The main factor is no doubt the socio-economic gap between the two
categories of countries. Education does not operate in a socio-economic
vacuum, a fact which is shown not least by the consistently substantial
correlations between various family background measures and student
achievement in all subject areas. Passow, Noah and Eckstein have, in their report
on the 'National Case Study Questionnaire' (in press), drawn up 'national
profiles' for the 19 countries which participated in the first stage of the
Six Subject Survey. The size of the per capita GNP varies from about
US$ 1,400 to 4,500 in the industrialised countries, whereas it varies from
$90 to 270 in the LDC's which took part in the study. The size of the
non-primary sector of the economy in per cent of the GNP is in most cases 90 to
95 per cent in the rich countries as compared to 50 to 75 per cent in the
LDC's. The difference is even more marked if we measure the size in terms
of number of people employed in the primary and non-primary sectors
respectively.

Thus, the difference between developed and less developed countries
could be expected, considering the overall socio-economic setting for the
school systems in the two categories of countries. The outcomes of the
multivariate analyses tell us that the total effect of home background
variables in both science and reading is greater than the total effect of
all the school variables. Among the 10-year-olds, 35 per cent of the
variation between students can be attributed to family background and
22 per cent to school factors, including, of course, all the instructional
factors. The corresponding figures for the 14-year-olds are 42 and 26 per
cent respectively. What is 'family background' then? After a careful
study of some 20 variables that could be considered as candidates for an
overall measure of social background, the following were selected to form
a composite 'School Handicap Score' (SHS): (1) Father's occupation,
(2) Father's education, (3) Mother's education, (4) Use of dictionary at
home, (5) Number of books at home, and (6) Family size. It is pointed out,
in the international report in science, that the "effectiveness of the
education provided by the school must be assessed by what is achieved, after
allowance has been made for the nature of the community in which the school
is operating" (Comber and Keeves, 1973, p. 195). Thus, regardless of the
quality of the formal educational system, we can, on the basis of the impact
of the family background factors, predict a large difference in mean
achievement between the less and the more industrialised countries. Parents in
the former type of countries are in most cases illiterate and no reading
material is available at home. On the whole, the verbal environment in
which the children grow up is almost entirely oral and there are rather few
occasions on which reading skills picked up at school can be reinforced by
experiences at home.

A simple reading speed test was developed in order to measure to
what extent the mechanics of reading skills had been acquired. The items
consisted of short paragraphs of two or three simple sentences, and the
student had to indicate, by checking the right answer out of a choice of
three, that he had understood what he had read. The items were like this:
"Peter has a little dog. The dog is black with a
white spot on his back and one white leg. The colour
of Peter's dog is mostly:

black     brown     grey."

On the average, 10-year-olds in Europe had an error rate of about
10 per cent on items such as the one cited. At the 14-year-old level the
rate had gone down to about 4 per cent. For the three LDC's the rates
were:

          10-year-olds   14-year-olds
Chile         2?%             ?%
India         36%            33%
Iran          52%            20%
Therefore, there is some justification for what was said earlier
that quite a few of the 10- and 14-year-olds in the LDC's have not been
able to read the science items and the questions in the student
questionnaires.

5. The establishment of research competencies in education in LDC's

The IEA survey research, conducted over more than ten years, is
indeed highly sophisticated. Therefore, doubts have been raised,
not least in international agencies involved in technical assistance in one
way or another in LDC's, as to whether the techniques developed by IEA might
not be too sophisticated to become part of routine evaluation procedures in
these countries.

Since four LDC's participated in the Six Subject Survey along with
15 more or less industrialised countries, it would seem in order at this
juncture to take stock of the experience which has been gained.

In the first place the participating institutions have accumulated
a vast experience in terms of research strategies and techniques related
to the evaluation of national systems of education. The IEA international
headquarters as well as the national centres have over the years
cooperatively built up a considerable amount of collective competence with regard
to the conceptualisation of evaluation research, the appropriate techniques
for dealing with different kinds of problems and the modes of feedback to
policy makers in the countries concerned. The completed studies have had
an impact on purely pedagogical matters, such as curriculum development and
the provision of instructional facilities, but also on considerations related
to the structure of the school systems.

In spite of the obvious limitations and drawbacks that the
application of the IEA methodology had in some LDC's and which have been dealt
with above, I think that for two major reasons the experiences gained
(which we, of course, have to take stock of) make a case for further
developmental work that would in the long run make these techniques a routine
procedure in evaluating the systems of education in LDC's.

In the first place, the major advantage that I see as the most
encouraging experience from the IEA Six Subject Survey is its contribution
to the build-up of research competence in the participating national centres.
Those of us who were responsible for the technical and administrative
co-ordination of the project have found how, in spite of scepticism and bad
odds, the research competencies in the LDC's especially were tremendously
developed.

In some countries this was the first time a sample survey in education
had been conducted, and the technique of drawing nationally representative
samples had to be learned by actual practice, either by bringing the
technical officers together at international seminars and briefings or by
dispatching experts from the IEA headquarters. Examinations are in some countries
conducted nation-wide with instruments which cannot be quickly and
objectively scored, since they are essay examinations. The development and tryout
of the IEA international achievement tests in these countries was another
lesson learned by actually carrying out the procedure. Finally, the
techniques that can be used in data processing in education and in making such data
available to statistical analyses conducive to finding out what factors
account for between-student and between-school differences in achievement
had to be learned in the same way.

It was, however, a matter not only of trying to build competence in 
conducting evaluation surveys but also of making those who were involved 
aware of certain features of their own national system of education by 
broadening their perspective to encompass a series of other systems. The 
National Case Study Questionnaire, which had to be completed by each of the 
national centres, aimed at collecting information not only about overall
features of the respective national systems of education as such, but also 
about the social and economic settings in which the systems were operating. 

Those who were responsible for conducting the survey, not least
those in the LDC's, learned a lot about their own systems which they did
not have a concrete idea about before. A national survey of the educational
system in a country, with all its limitations and technical snags, provides
findings which can be brought to bear on educational policy and planning.
So far we have known very little about what factors account for differences
between schools and students in achievement. We have, for instance, not
been aware of the fact that the various factors in the home background do
not play the same role in many LDC's as they do in the highly industrialised
countries. Western 'standard' background variables, such as father's
occupational status and parental education, seem to account much less for
differences between students in achievement in the LDC's than in the highly
industrialised countries.

A detailed analysis of student performance in a particular subject
area can provide valuable feedback to curriculum developers. This is
particularly useful for curriculum development in LDC's, since there has
been a strong tendency to adopt subject matters as defined by textbooks
in the industrialised countries without closer consideration of the
particular needs and circumstances in the borrowing country.

Finally, it should be pointed out that the evaluation techniques
employed by IEA are in principle applicable to both the formal and informal
educational system. Individuals have to be sampled in a representative
way. Yardsticks of performance as well as of attitudes have to be
developed. Questionnaires administered to students, teachers and
administrators have to be devised in order to collect relevant background
information. Such information, by the way, is not always available, simply
because it might never have been the object of any kind of survey or census.

6. Concluding remarks

It is by no means a coincidence that international co-operative survey
research in education started with evaluation problems. Before one can
begin to investigate to what extent various factors account for differences
between classrooms, schools and entire national systems of formal education,
it is necessary to develop international criteria of evaluation. The
construction of international instruments that can be used in evaluating both
the cognitive and non-cognitive outcomes of instruction is in itself an
important research accomplishment. But it is only the first step on the
way to the ultimate goal, which is to identify the salient factors which
account for differences between systems and to explain why they differ.
By means of such research it will be possible to establish international
indicators of the qualitative outcomes of school education. One would
thereby also be able to inform planners and policy-makers about what
indicators are worthwhile to manipulate in terms of policy action.

Closely related to this is the problem of how the 'productivity'
of a national system of school education should be assessed. Too long
have we tended to evaluate the outcomes in terms of the number of
individuals who are enrolled at a particular stage in the system or in terms of
how many years they have completed rather than by the competence they have
achieved. A certain amount of schooling in terms of number of years or a
particular certificate can by no means be regarded as comparable quantities
from one system to another. Furthermore, it is not satisfactory, when
evaluating its quality, to limit oneself to the end products of a system.
One has also to consider its power to take care of and impart competence
in all students who enter the system. Since attrition, particularly in
terms of drop-outs, is in many systems very high, one basic question that
needs to be answered in evaluating a system is: How many students are
brought how far?

As far as the evaluation of national systems of education in the
LDC's is concerned, the IEA research has brought about the accumulation
of strategies and techniques which can begin to be utilised routinely.
Methods of analysing national curricula in terms of the goals which are
to be achieved have been developed. Similarly, techniques have been
devised by means of which instruments can be constructed to measure these
goals. Procedures for drawing probability samples from target populations
under consideration have been developed. Routines for data collection
in the schools have been tried out in a wide variety of contexts. Finally,
experience has been gained in data processing of particular relevance to
nation-wide evaluation surveys.

The IEA international headquarters, as well as the national centres,
have over the last ten years built up a considerable amount of collective
competence with regard to the conceptualisation of research problems
connected with evaluation, the techniques employed and the different modes
of feedback to policy-makers in the countries concerned. The co-operative
machinery that has been built up could be utilised to provide training
programmes for students from regions of the world where particular strengths
and competencies in evaluation are still developing. From the IEA
international network one could set up task forces to work with centres in LDC's.
Such forces could co-operate with local researchers on designing evaluation
surveys.